Classifying Image Galleries into a Taxonomy Using Metadata and Wikipedia
Authors
Abstract
This paper presents a method for the hierarchical classification of image galleries into a taxonomy. The proposed method links textual gallery metadata to Wikipedia pages and categories. Entity extraction from metadata, entity ranking, and category selection are all based on Wikipedia and do not require labeled training data. The resulting system performs well above a random baseline, achieving a micro-averaged F-score of 0.59 on the 9 top categories of the taxonomy and 0.40 when using all 57 categories.
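For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a micro-averaged F-score can be computed when each gallery may carry several category labels. It is not the authors' evaluation code: the function name `micro_f_score`, the set-based input format, and the example category names are assumptions made purely for illustration.

```python
from typing import Sequence, Set


def micro_f_score(gold: Sequence[Set[str]], predicted: Sequence[Set[str]]) -> float:
    """Micro-averaged F-score over multi-label category assignments.

    Counts of true positives, false positives, and false negatives are
    pooled over all galleries before precision and recall are computed.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        tp += len(gold_labels & predicted_labels)   # correctly assigned categories
        fp += len(predicted_labels - gold_labels)   # assigned but not in the gold set
        fn += len(gold_labels - predicted_labels)   # in the gold set but missed

    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correct

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: three galleries with illustrative category names.
gold = [{"Sports"}, {"Travel"}, {"Nature", "Travel"}]
predicted = [{"Sports"}, {"Nature"}, {"Travel"}]
score = micro_f_score(gold, predicted)
```

Because the counts are pooled before averaging, frequent categories weigh more heavily on the score than rare ones, which is one reason a result over 57 categories can differ noticeably from a result over 9.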
Similar resources
Classifying the Wikipedia Articles into the OpenCyc Taxonomy
This article presents a method for classifying Wikipedia articles into the OpenCyc taxonomy. The method uses several sources of classification information: the Wikipedia category system, the infoboxes attached to the articles, the first sentences of the articles (treated as their definitions), and the direct mapping between the articles and the Cyc symbols. The classif...
Generating Image Captions using Topic Focused Multi-document Summarization
In the near future digital cameras will come standardly equipped with GPS and compass and will automatically add global position and direction information to the metadata of every picture taken. Can we use this information, together with information from geographical information systems and the Web more generally, to caption images automatically? This challenge is being pursued in the TRIPOD pr...
Classifying Taxonomic Relations between Pairs of Wikipedia Articles
Natural language generation systems rely on taxonomic thesauri for tasks such as lexical choice and aggregation. WordNet is one such taxonomy, but it is limited in size. Motivated by the needs of a generation system in the scientific literature domain, we present a method for building a taxonomic thesaurus from Wikipedia articles, where each article represents a potential concept in the taxonom...
Blind Relevance Feedback for the ImageCLEF Wikipedia Retrieval Task
In this paper we will describe Berkeley’s approach to the ImageCLEF Wikipedia Retrieval task for 2010. Our approach to this task was primarily to use text-based searches on the contents of the Wikipedia image metadata records. In addition we submitted one run using a database derived from the provided “bag.xml” set of 5000 descriptor “words” for each image and query example images. We had also ...
DCU at WikipediaMM 2009: Document Expansion from Wikipedia Abstracts
In this paper, we describe our participation in the WikipediaMM task at CLEF 2009. Our main efforts concern the expansion of the image metadata from the Wikipedia abstracts collection DBpedia. Since the metadata is short for retrieval by query words, we decided to expand the metadata using a typical query expansion method. In our experiments, we use the Rocchio algorithm for document expansion....